Method for detecting and estimating the spatial position of objects from a two-dimensional image

ABSTRACT

Use is made of an adaptive vector quantization in order to detect three-dimensional objects and to estimate their position parameters in space from two-dimensional coordinates of feature points which have been obtained by digital image preprocessing methods from a two-dimensional image of the three-dimensional object. For this purpose, a special learning method and a special method for object detection and for position estimation are specified. The method can be applied in wide fields of video-based production automation.

BACKGROUND OF THE INVENTION

Methods for mechanized detection of three-dimensional objects and their position in space are of fundamental importance in most applications of mechanized image processing and of image interpretation in fields of application such as robotics and production automation. In many of these cases of application, depth data is not available, and also cannot be determined with the aid of stereo images or temporal image sequences in which the object movement or the camera movement could be evaluated in order to obtain depth data. In such cases, it is necessary to use a single two-dimensional image of the object to be detected in order to reconstruct the object identity and its position in space from the two-dimensional image. For this purpose, it is generally necessary to use models for the objects to be detected and a model for the imaging process. For reasons of simplification, objects are frequently modelled with the aid of corner points and edge lines. The optical imaging of space into the image plane can be approximated in many cases with sufficient accuracy, for example by central projection. The technical problem in detecting and estimating the position parameters of three-dimensional objects in space from a two-dimensional image, frequently also referred to as the n-point perspective problem, consists in determining the position and spatial arrangement of a three-dimensional object from n feature points which have been found with the aid of suitable preprocessing methods in a two-dimensional image of the object to be detected. Examples of such feature points are, inter alia, images of object points or images of other prominent points on the object surface.

The literature discloses various methods for the positional estimation of three-dimensional objects in space. Essentially, it is possible to distinguish two types of method: on the one hand, there are approaches to the analytical solution of the n-point perspective problem which concern themselves with the minimum number of correspondences which are required in order to solve the perspective problem of rigid objects, that is to say the problem of assignment between object points and feature points in the image plane. In these types of method, the relationship, which models the image, between points in space and points in the image plane, for example central projection, is used to set up a system of equations. Depending on the number of feature points in the image plane, this system of equations can be underdetermined, uniquely solvable or overdetermined. These methods therefore generally use a subset of all the available feature points in the image plane, which lead to a uniquely solvable and well-conditioned system of equations. Such a method is described, for example, in the publication by R. M. Haralick, "Using Perspective Transformations in Scene Analysis", Computer Vision, Graphics and Image Processing 13, 1980, pages 191-221. Methods of this type generally lead to overdetermined or underdetermined systems of equations, and therefore require preprocessing of the image with the aim of selecting a suitable set of feature points, and are generally not very robust with respect to disturbances.

A second type of method exploits the iterative solution of overdetermined systems of equations with the aid of nonlinear minimization methods. In this case, a cost function which is determined by means of correspondences previously found between image features and model parameters is minimized. An example of this second type of method is described in the publication by D. G. Lowe, "Three-Dimensional Object Recognition from Single Two-Dimensional Images", Artificial Intelligence 31, 1987, pages 355-395. This second type of method is certainly far less sensitive to disturbances, but like the first type of method has the disadvantage that a nonlinear analytical model of the optical imaging process is required and must be processed using numerical methods.

SUMMARY OF THE INVENTION

It is the object of the invention to specify a method which solves the problems specified above and avoids the disadvantages of the known methods. Moreover, the invention has the aim of specifying a method by means of which not only the position of a known object can be estimated from a two-dimensional image, but by means of which it is also possible to detect an unknown object belonging to a prescribed set of known objects and to estimate its spatial position. These objects are achieved by means of a method for estimating position parameters of an object in space from a two-dimensional image, and by means of a method for object detection in conjunction with simultaneous estimation of position parameters of the object in space from a two-dimensional image.

The various methods of the present invention are described in general terms as follows.

A method of the present invention for estimating position parameters of an object in space from a two-dimensional image has the following features:

a) two totalities, indexable with whole numbers i, of reference patterns m(i)=(m(i, 1), . . . , m(i, M)) and associated position parameters x(i)=(x(i, 1), . . . , x(i, N)) are prescribed;

b) a master vector v=(v(1), . . . , v(M)) with components of the form v(k)=(p(k), q(k)) is formed from two-dimensional coordinates p(k), q(k) of feature points in the image, and output values a(i)=a(v,m(i)) are calculated from it for each index i with the aid of the reference patterns m(i);

c) the desired position parameters of the object are the position parameters x(w) belonging to the index w with optimum output value a(w).

A method of the present invention for object detection in conjunction with simultaneous estimation of position parameters of an object in space from a two-dimensional image has the following features:

a) two totalities, indexable with whole numbers i, of reference patterns m(i)=(m(i, 1), . . . , m(i, M)) and associated interpretations z(i)=(z(i, 1), . . . , z(i, N)) are prescribed;

b) a master vector v=(v(1), . . . , v(M)) with components of the form v(k)=(p(k), q(k)) is formed from two-dimensional coordinates p(k), q(k) of feature points in the image, and output values a(i)=a(v,m(i)) are calculated from it for each index i with the aid of the reference patterns m(i);

c) the desired object identification and the associated position parameters are yielded as components of the interpretation z(w) relating to the index w with optimum output value a(w).

A further method of the present invention for estimating position parameters of an object in space from a two-dimensional image has the following features:

a) two totalities, indexable with whole numbers i, of reference patterns m(i)=(m(i, 1), . . . , m(i, M)) and associated position parameters x(i)=(x(i, 1), . . . , x(i, N)) are prescribed;

b) each index i is assigned a point (i(1), i(2), i(3)) on a grid in three-dimensional space in a reversibly unique fashion;

c) a master vector v=(v(1), . . . , v(M)) is formed from two-dimensional coordinates of feature points in the image, and output values a(i)=a(v,m(i)) are calculated from it for each index i with the aid of the reference patterns m(i);

d) the index or grid point w with optimum output value a(w) is determined;

e) the optimum of the function

    A(u,b)=a(v,b·m(w)+(1-b)·m(u))

of the real variable b, where 0≦b≦1, is determined for all grid points u from a prescribed environment U of the grid point w;

f) from among all the grid points u of the environment U of the grid point w, that grid point opt is determined for which it holds that:

A(opt, b(opt)) is optimal among all A(u,b(u)), b(u) denoting the position of the optimum of A(u,b) as a function of b in the interval 0≦b≦1;

g) the desired position parameters x of the object are yielded with the aid of the relationship

    x=b(opt)·x(w)+(1-b(opt))·x(opt)

as a convex linear combination of the position parameters of two grid points.
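Steps d) to g) can be illustrated by the following minimal sketch. It assumes that the output function a(v, m) is the squared Euclidean spacing, so that the optimum of A(u, b) over 0≦b≦1 has a closed form; for other output functions the one-dimensional optimum would have to be found differently. The function names, the flat list layout of the master vector, and the way the environment U is passed in are illustrative assumptions and are not taken from the specification.

```python
def sq_dist(a, b):
    """Squared Euclidean spacing between two equally long vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def interpolate_position(v, m, x, w, neighbors):
    """Refine the estimate x(w) by interpolating between w and its neighbors.

    v         : master vector as a flat list of coordinates
    m         : list of reference patterns, one flat list per index
    x         : list of position-parameter vectors, one per index
    w         : index with optimum (here: minimum) output value
    neighbors : indices u forming the environment U of w (assumed given)
    """
    best_a, best_x = sq_dist(v, m[w]), list(x[w])  # b = 1 reproduces x(w)
    for u in neighbors:
        d = [mw - mu for mw, mu in zip(m[w], m[u])]
        denom = sum(c * c for c in d)
        if denom == 0.0:
            continue  # identical patterns: nothing to interpolate
        # Unconstrained minimizer of |v - (b*m(w) + (1-b)*m(u))|^2, then
        # clipped to the interval 0 <= b <= 1.
        b = sum((vk - mu) * dk for vk, mu, dk in zip(v, m[u], d)) / denom
        b = min(1.0, max(0.0, b))
        m_int = [b * mw + (1 - b) * mu for mw, mu in zip(m[w], m[u])]
        a = sq_dist(v, m_int)
        if a < best_a:
            best_a = a
            best_x = [b * xw + (1 - b) * xu for xw, xu in zip(x[w], x[u])]
    return best_x, best_a
```

For example, with reference patterns [0, 0] and [2, 0], position parameters [0] and [10], and a master vector [0.5, 0], the sketch yields b(opt)=0.75 and an interpolated position parameter of 2.5.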

Another method of the present invention for estimating position parameters of an object in space from a two-dimensional image has the following features:

a) two totalities, indexable with whole numbers i, of reference patterns m(i)=(m(i, 1), . . . , m(i, M)) and associated position parameters x(i)=(x(i, 1), . . . , x(i, N)) are prescribed;

b) each index i is assigned a point (i(1), i(2), i(3)) on a grid in three-dimensional space in a reversibly unique fashion;

c) a master vector v=(v(1), . . . , v(M)) is formed from two-dimensional coordinates of feature points in the image, and output values a(i)=a(v,m(i)) are calculated from it for each index i with the aid of the reference patterns m(i);

d) the index or grid point w with optimum output value a(w) is determined;

e) for all grid points u from a prescribed environment U of the grid point w, numbers b(u, opt) and a number b(w, opt), where 0≦b(u, opt), b(w, opt)≦1, are determined, for which the function ##EQU1## is optimum;

f) the desired position parameters x of the object are yielded with the aid of the relationship ##EQU2## as a convex linear combination of the position parameters x(w) and x(u).

A further method of the present invention for object detection in conjunction with simultaneous estimation of position parameters of the object in space from a two-dimensional image has the following features:

a) two totalities, indexable with whole numbers i, of reference patterns m(i)=(m(i, 1), . . . , m(i, M)) and associated interpretations z(i)=(z(i, 1), . . . , z(i, N)) are prescribed, the components of the interpretations corresponding to an object recognition and to position parameters of the object;

b) each index i is assigned a point (i(1), i(2), i(3)) on a grid in three-dimensional space in a reversibly unique fashion;

c) a master vector v=(v(1), . . . , v(M)) is formed from two-dimensional coordinates of feature points in the image, and output values a(i)=a(v,m(i)) are calculated from it for each index i with the aid of the reference patterns m(i);

d) the index or grid point w with optimum output value a(w) is determined;

e) the optimum of the function

    A(u,b)=a(v,b·m(w)+(1-b)·m(u))

of the real variable b, where 0≦b≦1, is determined for all grid points u from a prescribed environment U of the grid point w;

f) from among all the grid points u of the environment U of the grid point w, that grid point opt is determined for which it holds that:

A(opt, b(opt)) is optimal among all A(u,b(u)), b(u) denoting the position of the optimum of A(u,b) as a function of b in the interval 0≦b≦1;

g) the desired object identification is yielded from the corresponding component of the interpretation z(w) relative to the index w with optimum output value a(w);

h) the desired position parameters z are yielded with the aid of the relationship

    z=b(opt)·z(w)+(1-b(opt))·z(opt)

as a convex linear combination of the position parameters of two grid points.

Another method of the present invention for object detection in conjunction with simultaneous estimation of position parameters of an object in space from a two-dimensional image has the following features:

a) two totalities, indexable with whole numbers i, of reference patterns m(i)=(m(i, 1), . . . , m(i, M)) and associated interpretations z(i)=(z(i, 1), . . . , z(i, N)) are prescribed, the components of the interpretations corresponding to an object recognition and to position parameters of the object;

b) each index i is assigned a point (i(1), i(2), i(3)) on a grid in three-dimensional space in a reversibly unique fashion;

c) a master vector v=(v(1), . . . , v(M)) is formed from two-dimensional coordinates of feature points in the image, and output values a(i)=a(v,m(i)) are calculated from it for each index i with the aid of the reference patterns m(i);

d) the index or grid point w with optimum output value a(w) is determined;

e) for all grid points u from a prescribed environment U of the grid point w, numbers b(u, opt) and a number b(w, opt), where 0≦b(u, opt), b(w, opt)≦1, are determined, for which the function ##EQU3## is optimum;

f) the desired object identification is yielded from the corresponding components of the interpretation z(w) relative to the index w with optimum output value a(w);

g) the desired position parameters z are yielded with the aid of the relationship ##EQU4## as a convex linear combination of the corresponding components of the interpretations z(w) and z(u).

A method of the present invention for adapting reference patterns for estimating position parameters of an object in space from a two-dimensional image has the following features:

a) two totalities, indexable with whole numbers i, of reference patterns m(i)=(m(i, 1), . . . , m(i, M)) and associated position parameters x(i)=(x(i, 1), . . . , x(i, N)) are prescribed and randomly distributed;

b) each index i is assigned a point (i(1), i(2), i(3)) on a grid in three-dimensional space in a reversibly unique fashion;

c) for each instant t of the adaptation

a learning rate L(t) is prescribed which decreases monotonically with time;

a coupling strength function h(i, j, t) is prescribed which assigns to two points (i(1), i(2), i(3)), (j(1), j(2), j(3)) on the grid in each case a coupling strength h which decreases monotonically with the spacing of the grid points and monotonically with time;

a training master vector v(t)=(v(1, t), . . . , v(M, t)), which is formed from two-dimensional coordinates of feature points in the image, and position parameters XT(t)=(XT(1, t), . . . , XT(N, t)) belonging to this training master vector are prescribed;

for each index i the instantaneous output values a(i, t)=a(v(t), m(i, t)) are calculated from the prescribed training master vector v(t) and the instantaneous reference patterns m(i, t)=(m(i, 1, t), . . . , m(i, M, t));

the index w(t) with optimum output value a(w, t) is determined;

the reference patterns m(i, t) and the position parameters x(i, t) relating to each index i are adapted with the aid of the relationships

    m(i, t+1)=m(i, t)+L(t)·h(i, w, t)·(v(t)-m(i, t))

    x(i, t+1)=x(i, t)+L(t)·h(i, w, t)·(XT(t)-x(i, t)).

A method of the present invention for adapting reference patterns for object detection in conjunction with simultaneous estimation of position parameters of an object in space from a two-dimensional image has the following features:

a) two totalities, indexable with whole numbers i, of reference patterns m(i)=(m(i, 1), . . . , m(i, M)) and associated interpretations z(i)=(z(i, 1), . . . , z(i, N)) are prescribed and randomly distributed, the components of the interpretations corresponding to an object detection and to position parameters of the object;

b) each index i is assigned a point (i(1), i(2), i(3)) on a grid in three-dimensional space in a reversibly unique fashion;

c) for each instant t of the adaptation

a learning rate L(t) is prescribed which decreases monotonically with time;

a coupling strength function h(i, j, t) is prescribed which assigns to two points (i(1), i(2), i(3)), (j(1), j(2), j(3)) on the grid in each case a coupling strength h which decreases monotonically with the spacing of the grid points and monotonically with time;

a training master vector v(t)=(v(1, t), . . . , v(M, t)), which is formed from two-dimensional coordinates of feature points in the image, and interpretations ZT(t)=(ZT(1, t), . . . , ZT(N, t)) belonging to this training master vector are prescribed;

for each index i the instantaneous output values a(i, t)=a(v(t), m(i, t)) are calculated from the prescribed training master vector v(t) and the instantaneous reference patterns m(i, t)=(m(i, 1, t), . . . , m(i, M, t));

the index w(t) with optimum output value a(w, t) is determined;

the reference patterns m(i, t) and the interpretations z(i, t) relating to each index i are adapted with the aid of the relationships

    m(i, t+1)=m(i, t)+L(t)·h(i, w, t)·(v(t)-m(i, t))

    z(i, t+1)=z(i, t)+L(t)·h(i, w, t)·(ZT(t)-z(i, t)).

In the last two above-described methods, a learning rate L(t) of the form ##EQU5## can be used, with prescribed initial and final learning rates L(start) and L(end), respectively, up to the final instant t(end) of the adaptation. A coupling strength function h(i, j, t) of the form ##EQU6## can be used, with prescribed initial and final ranges s(start) and s(end), respectively, up to the final instant t(end) of the adaptation.

The master vectors v=(v(1), . . . , v(M)) in the form of v(k)=(p(k), q(k)) are formed from the two-dimensional coordinates p(k), q(k) of feature points k=1, . . . , M.

The output values a(i) can be calculated with the aid of the relationship ##EQU7## from the master vector v and the reference patterns m(i), minimum output values being optimal.

Alternatively, the output values a(i) can be calculated with the aid of the relationship ##EQU8## from the master vector v and the reference patterns m(i), the weights g(v, k) being a measure of the reliability of the values of the components v(k) of the master vector v, and minimum output values being optimal.

As a further alternative, the output values a(i) can be calculated with the aid of the relationship ##EQU9## from the master vector v and the reference patterns m(i), maximum output values being optimal.
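The three output functions are not reproduced in this text, so the following sketch rests on assumptions: the first is taken to be the squared Euclidean spacing and the third the scalar product, as the detailed description states for its two preferred choices, and the second is assumed to be the squared spacing with each component weighted by g(v, k). The function names and the representation of v and m(i) as lists of (p, q) pairs are illustrative only.

```python
def a_euclid(v, m):
    """Squared Euclidean spacing; smaller values are better."""
    return sum((vp - mp) ** 2 + (vq - mq) ** 2
               for (vp, vq), (mp, mq) in zip(v, m))


def a_weighted(v, m, g):
    """Assumed weighted variant: g[k] expresses how reliable v(k) is."""
    return sum(gk * ((vp - mp) ** 2 + (vq - mq) ** 2)
               for gk, (vp, vq), (mp, mq) in zip(g, v, m))


def a_scalar(v, m):
    """Scalar product; larger values are better."""
    return sum(vp * mp + vq * mq for (vp, vq), (mp, mq) in zip(v, m))


def winner(outputs, maximize=False):
    """Index w with optimum output value (minimum unless maximize=True)."""
    pick = max if maximize else min
    return pick(range(len(outputs)), key=outputs.__getitem__)
```

Setting a weight g(v, k) to zero removes the feature point k from the comparison, which is one plausible way of handling a feature point that the preprocessing could not locate reliably.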

Among the position parameters are rotational angles referred to three axes of a spatial coordinate system.

Since these methods do not use any kind of analytical models of the optical imaging process, nor yet any explicit object models, their application presupposes a learning process with the aid of suitable learning methods. The invention therefore also comprises a method for adapting reference patterns to the estimation of position parameters of an object in space from a two-dimensional image, and a method for adapting reference patterns to object recognition in conjunction with simultaneous estimation of position parameters of the object in space from a two-dimensional image. The method according to the invention is based on the method of adaptive vector quantization, which can also be formulated as a so-called Kohonen network with the aid of concepts from the field of neural networks (T. Kohonen, "Representation of Sensory Information in Self-Organizing Feature Maps, and Relation of these Maps to Distributed Memory Networks", Proc. SPIE Optical and Hybrid Computing, vol. 634, pages 248 to 259, 1986). In this case, so-called master vectors are formed from the two-dimensional coordinates of feature points in the image plane and are compared with stored reference patterns, which are also referred to in the language of neural networks as weighting coefficients. Each reference pattern is in this case assigned an interpretation by means of which it is possible to draw conclusions directly concerning the position parameters of the object to be detected and concerning the object identity. The desired position parameters or the desired object identity are yielded in this case as an interpretation of that reference pattern which is most similar to the master vector representing the feature points of the image plane.

The accuracy and reliability of this mode of procedure can be further substantially improved when, instead of the interpretation of the reference pattern most similar to the master vector, an estimate of position is carried out by reference pattern interpolation.

The correct or optimum assignment between reference patterns and interpretation or position parameters and object identities is found with the aid of a learning method, in which the totality of the reference patterns and interpretations is varied in steps under the influence of a large number of training master vectors, starting from an initial random selection. At the end of this learning process, the reference patterns represent the statistical distribution of the training patterns, and the structural characteristics of the objects to be detected are coded, in common with the structural characteristics of the optical imaging process, in the assignment between the reference patterns and interpretations.

BRIEF DESCRIPTION OF THE DRAWINGS

The features of the present invention which are believed to be novel are set forth with particularity in the appended claims. The invention, together with further objects and advantages, may best be understood by reference to the following description taken in conjunction with the accompanying drawings, in the several Figures of which like reference numerals identify like elements, and in which:

FIG. 1 shows a cuboid which is projected into the image plane and has the corner points 1, 2, . . . , 7 and the two-dimensional coordinates q(1), . . . , p(3) belonging to the corner points 1 and 3, respectively.

FIG. 2 shows a flow diagram of the method for positional estimation and for object detection in conjunction with simultaneous positional estimation.

FIG. 3 shows a flow diagram of the method for adaptation of reference patterns for estimating position parameters or for detecting objects in conjunction with simultaneous estimation of position parameters.

DESCRIPTION OF THE PREFERRED EMBODIMENT

The invention is explained in more detail below with the aid of a preferred exemplary embodiment.

For typical applications, for example in robotics, two-dimensional images of three-dimensional objects are generated with the aid of electronic cameras. In this case, the three-dimensional space is projected with the aid of optical imaging into a two-dimensional image plane. The structural characteristics of this optical imaging process depend on the optical characteristics of the camera used, and can generally be described sufficiently accurately by the mathematical model of central projection (also termed perspective projection). In central projection, each space point (X, Y, Z) is assigned a point (p, q) in the image plane as follows:

    (p, q)=(f·X/Z, f·Y/Z)                    (1)

f denoting the focal length of the optical imaging system (Y. Shirai, "Three-Dimensional Computer Vision", Springer-Verlag, Berlin, Heidelberg, New York, 1987, pages 11-13). The result of such optical imaging with the aid of an electronic camera is finally a two-dimensional projection of the three-dimensional object, which is present in a digital image memory.
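Equation (1) can be illustrated by the following short sketch. The function name, the tuple representation of a space point, and the check that the point lies in front of the camera (Z greater than 0) are assumptions made for the illustration only.

```python
def project(point, f):
    """Central projection of a space point (X, Y, Z) per equation (1)."""
    X, Y, Z = point
    if Z <= 0:
        # Assumption: only points in front of the camera are imaged.
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (f * X / Z, f * Y / Z)
```

For a focal length f=2, the space point (1, 2, 4) is thus assigned the image point (p, q)=(0.5, 1.0).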

With the aid of various methods of digital image preprocessing, it is possible to obtain from such an image, which can be understood as a two-dimensional matrix of pixels, prominent points in the image plane which are also referred to as feature points. FIG. 1 shows a three-dimensional cuboid with visible corner points 1, 2, . . . , 7 which is projected into the two-dimensional image plane, whose coordinate axes are denoted by q and p. Finding the corner points of the cuboid in the image plane with the aid of suitable methods of digital image preprocessing makes the two-dimensional coordinates available for positional estimation or object detection. An overview of the methods available for feature extraction is to be found in Y. Shirai (1987). The images of object corner points are chiefly suitable as feature points. These object corner points can be found in the image plane in an advantageous way as points of intersection of object edges. This reduces the problem of locating corner points in the image plane to the detection of edges. A number of methods are available for detecting object edges in the image plane, of which the most important are based on edge operators or on the Hough transformation (Y. Shirai 1987).
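Once edges have been detected, a corner point can be computed as the point of intersection of two edge lines. The following sketch assumes that each detected edge is available as two image points lying on it; it uses the standard homogeneous-coordinate construction, which is one possible technique and is not prescribed by the specification. All names are illustrative.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p1, p2):
    """Homogeneous line through two image points (p, q)."""
    return cross((p1[0], p1[1], 1.0), (p2[0], p2[1], 1.0))


def corner(edge_a, edge_b, eps=1e-12):
    """Intersection of two edges, or None if they are (nearly) parallel."""
    h = cross(line_through(*edge_a), line_through(*edge_b))
    if abs(h[2]) < eps:
        return None
    return (h[0] / h[2], h[1] / h[2])
```

For example, the edges through (0, 0), (2, 2) and through (0, 2), (2, 0) intersect in the corner point (1, 1).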

In order to be able to interpret digital images preprocessed in this way as objects, it is necessary to find the correct assignment between the points of a corresponding object model and the feature points in the image plane. This problem is considerably compounded by the unknown position and unknown spatial orientation of the projected objects. For this reason, a knowledge-based and model-based interpretation of the preprocessed images must be preceded by suitable positional estimation, the aim of which is the determination of position parameters of the projected object, that is to say of rotation parameters and, possibly, of a displacement.

The solution of this problem according to the invention utilizes the method of adaptive vector quantization, which is also referred to as the Kohonen network. Since this is an adaptive method, its technical application presupposes a suitable learning method. Such a learning method is therefore described below with reference to FIG. 3. Carrying out this learning method requires a space grid of neurons, that is to say a totality RG, indexable with whole numbers i, of reference patterns m(i), each index i of the totality being assigned a grid point (i(1), i(2), i(3)) with whole numbers i(1), i(2) and i(3) in a reversibly unique fashion. Given a prescribed master vector v, each neuron, that is to say each element i of the totality, assigns to this vector an output value a(v, m(i)), which is calculated from the vector and the neuron's reference pattern m(i) with the aid of an output function a(v, m) which is identical for all neurons. At the start of the learning process, the values of all the reference patterns m(i) are chosen entirely randomly.
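The space grid and its random initialization might be set up as in the following sketch. The grid extents, the flat storage of each reference pattern as a list of 2·M coordinates, the value range of the random numbers, and all names are assumptions for the purpose of illustration.

```python
import random


def make_grid(n1, n2, n3, dim, seed=0):
    """Create an n1 x n2 x n3 grid of neurons with random reference patterns.

    Returns the grid point of each index i, the inverse mapping from grid
    point to index, and one random reference pattern of length dim per index.
    """
    rng = random.Random(seed)
    points = [(i1, i2, i3)
              for i1 in range(n1) for i2 in range(n2) for i3 in range(n3)]
    # Inverse mapping: the assignment index <-> grid point is one-to-one.
    index_of = {pt: i for i, pt in enumerate(points)}
    patterns = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in points]
    return points, index_of, patterns
```

With seven feature points, as for the cuboid of FIG. 1, dim would be 14 in this layout.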

In order to be able to carry out positional estimation with the aid of such a Kohonen network, each neuron i is assigned an interpretation x(i)=(x(i, 1), . . . , x(i, N)). At the start of the learning process, that is to say the adaptation of the reference patterns, the values of the components of all the interpretations x(i) are likewise chosen entirely randomly. In this context, the components of the interpretations x(i) correspond to the position parameters to be estimated of the projected object.

In order to carry out the adaptation, a time-dependent learning rate L(t) is prescribed which decreases monotonically with time. This learning rate determines the rate of the learning process and becomes smaller the further advanced the learning process is. In experimental investigations of the method according to the invention, a learning rate function of the form ##EQU10## has proved to be advantageous.

In this case, L(start), L(end) and t(end) denote a prescribed initial learning rate, a prescribed final learning rate and a final instant of the learning process. Furthermore, a coupling strength function h(i, j, t) is prescribed, which assigns to two points (i(1), i(2), i(3)), (j(1), j(2), j(3)) on the grid in each case a coupling strength h which decreases monotonically with the spacing of the grid points and monotonically with time. Experimental investigations have shown that a particularly advantageous selection of the coupling strength function is given by ##EQU11## In this case, s(t) denotes the instantaneous range of the coupling strength function h, and the symbols s(start) and s(end) denote prescribed initial and final ranges respectively of the coupling strength function.

This choice of coupling strength function ensures that the coupling strength between two neurons decreases monotonically with the spacing between these neurons and that for a given spacing the coupling strength decreases monotonically with time. As a result, the coupling between adjacent neurons on the grid becomes continually shorter in range with advancing time and continually weaker for a given spacing. It is advantageous to choose the Euclidean spacing as a measure of the spacing between the neurons.
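The concrete forms of the learning rate and coupling strength functions are not reproduced in this text. The following sketch therefore uses assumed forms that merely satisfy the stated requirements: an exponential transition from the initial to the final value for both L(t) and the range s(t), and a Gaussian dependence of h on the Euclidean spacing of the grid points. These forms are common for Kohonen networks but are not confirmed by the specification; all names are illustrative.

```python
import math


def decay(start, end, t, t_end):
    """Assumed schedule: equals start at t=0 and end at t=t_end."""
    return start * (end / start) ** (t / t_end)


def coupling(gi, gj, s):
    """Assumed Gaussian coupling strength for grid points gi, gj and range s."""
    d2 = sum((a - b) ** 2 for a, b in zip(gi, gj))  # squared Euclidean spacing
    return math.exp(-d2 / (2.0 * s * s))
```

Under these assumptions, L(t) would be obtained as decay(L_start, L_end, t, t_end) and h(i, j, t) as coupling(grid[i], grid[j], decay(s_start, s_end, t, t_end)), so that the coupling decreases with the spacing and, for a given nonzero spacing, with time.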

After the parameters of the learning method have been prescribed, the actual learning method now begins. At each time step t, a training master vector v(t)=(v(1, t), . . . , v(M, t)), which is formed from two-dimensional coordinates of feature points in the image, and position parameters XT(t)=(XT(1, t), . . . , XT(N, t)) assigned to this training master vector are prescribed. At each time step t, the output function a(v, m), which is identical for all the neurons, is now used to calculate from this prescribed training master vector v(t) and the instantaneous values of the reference patterns m(i, t)=(m(i, 1, t), . . . , m(i, M, t)) the instantaneous output values a(i, t)=a(v(t), m(i, t)) for each index i. This output value is a measure of the similarity between the instantaneous value of the reference pattern m(i, t) of the neuron i and the applied training master vector v(t). In order to determine that neuron in the totality of neurons whose reference pattern m is most similar to the applied training master vector, that index w(t) is sought whose associated output value a(w, t) is optimum.

After determination of the optimum output value and of the index w(t) belonging to it, the actual adaptation step belonging to this instant t is carried out: the instantaneous values of the reference patterns m(i, t) of all the neurons i and the associated position parameters x(i, t) are adapted with the aid of the relationships

    m(i, t+1)=m(i, t)+L(t)·h(i, w, t)·(v(t)-m(i,t)) (5)

    x(i, t+1)=x(i, t)+L(t)·h(i, w, t)·(XT(t)-x(i, t)). (6)

The adaptation relationships (5) and (6) ensure that the distribution of the values of the reference patterns becomes ever more similar to the statistical distribution of the training master vectors in the course of the learning process, and that the relationships between the training master vectors and their associated position parameters are reflected by the relationship, arising during the adaptation, between the reference patterns of the neurons and the associated position parameters. The choice of a suitable output function a(v, m) is decisive in order to optimize the convergence of the learning method. Various experimental investigations have shown that chiefly two different possible choices of the output function a(v, m) lead to optimum efficiency of the process. The first possible choice is the square of the Euclidean spacing between the master vector and the reference pattern vector of a neuron, ##EQU12## It now follows from the relationships (5), (6) and (7) that by minimizing the output values when determining the optimum output value a(w, t) the convergence of the learning process can be ensured in the sense of an adaptive vector quantization.
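A single adaptation step according to relationships (5) and (6) can be sketched as follows. The sketch assumes the squared Euclidean spacing as output function, so that the minimum output value is optimum, and an assumed Gaussian coupling strength with range s; the learning rate L and the range s for the current instant are passed in as plain numbers. The function name and the list-based data layout are illustrative and not taken from the specification.

```python
import math


def adapt_step(m, x, grid, v, xt, L, s):
    """One learning step; updates m and x in place and returns the index w.

    m, x : lists of reference patterns and position parameters per index
    grid : grid point (i1, i2, i3) of each index
    v, xt: training master vector (flat list) and its position parameters
    L, s : learning rate and coupling range for the current instant
    """
    # Output values a(i, t) and index w(t) with optimum (minimum) output value.
    a = [sum((vk - mk) ** 2 for vk, mk in zip(v, m_i)) for m_i in m]
    w = min(range(len(a)), key=a.__getitem__)
    for i in range(len(m)):
        d2 = sum((p - q) ** 2 for p, q in zip(grid[i], grid[w]))
        g = L * math.exp(-d2 / (2.0 * s * s))  # L(t) * h(i, w, t), assumed form
        m[i] = [mk + g * (vk - mk) for mk, vk in zip(m[i], v)]    # relation (5)
        x[i] = [xk + g * (tk - xk) for xk, tk in zip(x[i], xt)]   # relation (6)
    return w
```

Repeating this step over many training master vectors, with L and s decreasing from step to step, corresponds to the learning process described above.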

In those cases in which the Euclidean norm of the master vectors is constant, it is advantageous to work with normalized reference patterns m(i). In this case, instead of the Euclidean spacing it is possible to choose another output function which is given by ##EQU13## When the output function according to equation (8) is used, the optimum output value is the maximum output value.

This is bound up with the fact that the master vector with the components v(k), k=1, . . . , M, becomes more similar to the reference pattern m(i) with the components m(i, k), k=1, . . . , M, the larger the scalar product, given by equation (8), of the master vector v and the reference pattern m(i).
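The connection between the two output functions can be made explicit by expanding the squared Euclidean spacing. The following worked step is a sketch in LaTeX notation which treats v(k) and m(i, k) as vector components and writes the scalar product as a sum over k; it assumes that the master vectors have constant norm and the reference patterns are normalized to unit norm, as stated above.

```latex
\begin{aligned}
\lVert v - m(i) \rVert^{2}
  &= \lVert v \rVert^{2} - 2 \sum_{k=1}^{M} v(k) \cdot m(i,k) + \lVert m(i) \rVert^{2} \\
  &= \bigl( \lVert v \rVert^{2} + 1 \bigr) - 2 \sum_{k=1}^{M} v(k) \cdot m(i,k)
  \qquad \text{for } \lVert m(i) \rVert = 1 .
\end{aligned}
```

Since the term in parentheses does not depend on the index i, the index with the minimum squared spacing is the same as the index with the maximum scalar product, so that both output functions select the same neuron w under these assumptions.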

Once the learning method is concluded, the trained neural network (or, in other words, the adapted totality of reference patterns) can be used for the positional estimation of three-dimensional objects. For this purpose, feature points are extracted from a given two-dimensional image of a three-dimensional object using the methods already described for preprocessing digital images (Y. Shirai 1987), and the two-dimensional coordinates of these feature points are combined to form a master vector v. This can preferably be performed by choosing the components v(k) of the master vector v in the form

    v(k)=(p(k), q(k)) k=1, . . . , M                           (9)

p(k) and q(k) being the two-dimensional coordinates of the feature point k in the image plane (FIG. 1). In order to estimate the spatial position of the three-dimensional object represented by this master vector v, the output value a(i)=a(v,m(i)) is now calculated in accordance with FIG. 2 for each index i of the totality of the neurons. Thereupon, the index w of that neuron is sought which has delivered the optimum output value a(w). The components of the interpretation x(w) of this neuron with the index w can now be regarded as the optimum estimation of the position parameters of the object represented by the master vector v.
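The estimation without interpolation amounts to a nearest-pattern lookup, as the following sketch shows. It assumes the squared Euclidean spacing as output function, a flat storage of each reference pattern in the order p(1), q(1), . . . , p(M), q(M), and feature points that are supplied in the same order as was used during learning; the names are illustrative.

```python
def estimate_position(feature_points, patterns, positions):
    """Return the winning index w and its interpretation x(w).

    feature_points : list of (p, q) image coordinates, k = 1..M
    patterns       : list of reference patterns m(i), each of length 2*M
    positions      : list of position-parameter vectors x(i)
    """
    # Master vector per equation (9), flattened to p(1), q(1), ..., p(M), q(M).
    v = [c for (p, q) in feature_points for c in (p, q)]
    outputs = [sum((vk - mk) ** 2 for vk, mk in zip(v, m_i)) for m_i in patterns]
    w = min(range(len(outputs)), key=outputs.__getitem__)
    return w, positions[w]
```

In a toy case with two reference patterns [0, 0, 1, 1] and [5, 5, 6, 6], the feature points (4.8, 5.1) and (6.2, 5.9) select the second pattern, and its stored position parameters are returned as the estimate.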

The quality of this estimate can, however, be noticeably improved further if, instead of the interpretation x(w) of the optimum neuron with the index w, an interpolated interpretation is used for the positional estimation. Such an interpolated interpretation can be described as follows:

As a consequence of the special characteristics of the reference pattern adaptation already described, there are situated in the immediate vicinity of the optimum neuron with the index w other neurons whose reference patterns are likewise similar to the applied master vector v, even though the output values belonging to these neurons are only suboptimum. Guided by the idea that the possible master vectors v form a continuum of vectors in an M-dimensional space, it can be assumed in general that it is possible to find an improved reference pattern vector by interpolation between the reference pattern vectors of the neurons in the immediate vicinity of the neuron with optimum output value.

The interpolation is carried out in the following steps:

An environment U of the neuron w with optimum output value a(w) is chosen, which preferably consists of the nearest neighbors of the neuron w. Subsequently, the convex linear combination

    m(w,u,b)=b·m(w)+(1-b)·m(u)               (10)

is formed for each neuron u of this environment U. In this case, b is a number between 0 and 1. For this interpolated reference pattern vector m(w,u,b), parameterized by b, an output function a(v,m(w,u,b)) is calculated, which is likewise parameterized by b. It is preferable to use once again one of the two forms specified in equations (7) and (8), respectively, as output function. Because of the simple analytical form of the described output functions, it is possible to search specifically for the optimum of the output function as a function of b. The value of b at this optimum is denoted for each neuron u of the environment by b(u). This yields for each neuron of the environment a numerical value b(u) between 0 and 1 and an interpolated reference pattern m(w,u,b(u)) belonging thereto. The optimum interpolated reference pattern is now found among these interpolated reference patterns by calculating the associated output values a(v,m(w,u,b(u))) for all the interpolated reference patterns m(w,u,b(u)) of all the neurons u of the environment. The optimum and thus improved interpolated reference pattern vector m(w, opt) is that reference pattern vector with the optimum associated output value a(v,m(w,u,b(u))). Denoting the interpolation coefficient belonging to this reference pattern by b(opt), an improved positional estimate x is obtained according to the formula

    x=b(opt)·x(w)+(1-b(opt))·x(opt)          (11)

x(opt) being the interpretation belonging to the neuron with optimum interpolated reference pattern.
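The interpolation of equations (10) and (11) can be sketched as follows for the squared Euclidean distance. For that output function, a(v,m(w,u,b)) is a quadratic in b, so the optimum on the interval from 0 to 1 is the unconstrained minimum b = (v-m(u))·(m(w)-m(u)) / |m(w)-m(u)|^2 clipped to that interval. This closed form is a derivation made for the example and is not taken from the text. The function names and the list of neighbour indices are likewise assumptions.

```python
def squared_distance(v, m):
    """Squared Euclidean distance between master vector v and pattern m."""
    return sum((vk - mk) ** 2 for vk, mk in zip(v, m))


def interpolated_estimate(v, patterns, interpretations, w, neighbours):
    """Return (x, b_opt, u_opt) for winner index w and neighbour indices."""
    m_w = patterns[w]
    # Start from the plain winner: b = 1 reproduces m(w) and x(w).
    best_output, b_opt, u_opt = squared_distance(v, m_w), 1.0, w
    for u in neighbours:
        m_u = patterns[u]
        d = [mwk - muk for mwk, muk in zip(m_w, m_u)]
        dd = sum(dk * dk for dk in d)
        if dd == 0.0:
            continue  # identical patterns: nothing to interpolate
        # Minimum of the quadratic in b, clipped to the interval [0, 1].
        b = sum((vk - muk) * dk for vk, muk, dk in zip(v, m_u, d)) / dd
        b = min(1.0, max(0.0, b))
        m_b = [b * mwk + (1.0 - b) * muk for mwk, muk in zip(m_w, m_u)]
        output = squared_distance(v, m_b)
        if output < best_output:
            best_output, b_opt, u_opt = output, b, u
    x_w, x_opt = interpretations[w], interpretations[u_opt]
    x = [b_opt * xw + (1.0 - b_opt) * xo for xw, xo in zip(x_w, x_opt)]
    return x, b_opt, u_opt
```

If no neighbour improves on the winner, the function falls back to x(w) unchanged, so the interpolated estimate is never worse than the plain lookup in terms of the output value.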

Experimental investigations have shown that it was possible to improve the accuracy of positional estimation by approximately 85% by using the interpolation method described here.

An even further-reaching improvement in positional estimation can be achieved by another type of interpolation:

For this purpose, all the reference patterns m(u) with u from a prescribed environment U of w are used to form the convex linear combination

    m=b(w)·m(w)+Σ(u in U) b(u)·m(u)

with non-negative coefficients b(w) and b(u) whose sum is 1. A search is subsequently made for the optimum of the function a(v,m) as a function of the coefficients b(u) and b(w). The optimum coefficients b(u, opt) and b(w, opt) are then used to calculate the interpolated positional estimate as

    x=b(w, opt)·x(w)+Σ(u in U) b(u, opt)·x(u)

The substantial insensitivity of the method for positional estimation with respect to errors in the determination of the coordinates of feature points can be further improved by using, instead of the output function (7), the output function

    a(v,m(i))=Σ(k=1, . . . , M) g(v,k)·(v(k)-m(i,k))^2

in which the weights g(v,k) are a measure of the reliability or accuracy of the values of the components v(k) of the master vector v. It is expedient for this purpose to set g(v,k)=1 if v(k) is present without uncertainty, and g(v,k)=0 if the value of v(k) is unknown. g(v,k) lies between 0 and 1 for all cases between these extremes.
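A weighted output function of this kind can be sketched as follows, assuming that each weight simply scales the squared difference of its component. The function names are hypothetical, and the helper that repeats each feature-point weight for both coordinates exists only because the example stores the master vector as a flat list.

```python
def expand_weights(point_weights):
    """Repeat each feature-point weight g(v,k) for the p and q coordinate."""
    return [g for g in point_weights for _ in range(2)]


def weighted_squared_distance(v, m, g):
    """Squared distance in which component k contributes with weight g[k]."""
    return sum(gk * (vk - mk) ** 2 for vk, mk, gk in zip(v, m, g))
```

A feature point that could not be located can then be entered with an arbitrary placeholder value and weight 0, so that it has no influence on which neuron delivers the optimum output value.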

The method described here can still be applied successfully even when the aim is not only to estimate the spatial position and orientation of a known object, but also to determine, simultaneously with the position and orientation of this object in space, the identity of an object which is unknown a priori and is taken from a known set of objects. In order to extend the described method in this way, a further component, which corresponds to the object identity, is added to the components of the interpretations which correspond to the position parameters to be estimated. When carrying out the learning method, it is then necessary to specify the relevant object identity for each training master vector in addition to the position parameters belonging to this training master vector. After completion of the learning method carried out in this way, the neural network is capable of determining the object identity in addition to the unknown position parameters.

However, it is to be borne in mind in this connection that, when the interpolation methods are used, only the components of the interpretations corresponding to the position parameters are interpolated, since a linear combination of object identities is meaningless.
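This restriction can be illustrated with a small sketch. It assumes that the object identity is stored as the trailing component of an interpretation, after the N position parameters, and that the identity of the winning neuron w is retained. The text does not state which identity is kept, so that choice, like the function name, is an assumption.

```python
def blend_interpretations(x_w, x_u, b, n_position):
    """Interpolate the first n_position components and keep the identity of x_w."""
    position = [
        b * pw + (1.0 - b) * pu
        for pw, pu in zip(x_w[:n_position], x_u[:n_position])
    ]
    # Identity components are copied, never averaged.
    return position + list(x_w[n_position:])
```

For example, blending [10.0, 20.0, "cube"] with [30.0, 40.0, "wedge"] at b=0.75 interpolates only the two position values and leaves the label "cube" untouched.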

The invention is not limited to the particular details of the method depicted and other modifications and applications are contemplated. Certain other changes may be made in the above described method without departing from the true spirit and scope of the invention herein involved. It is intended, therefore, that the subject matter in the above depiction shall be interpreted as illustrative and not in a limiting sense.

What is claimed is:
 1. A method for estimating position parameters of an object in space from a two-dimensional image using a Kohonen network having neurons, comprising the steps of: a) prescribing indexable weighting vectors m(i)=(m(i,1), . . . , m(i,M)) of a dimension M and positional parameters x(i)=(x(i,1), . . . , x(i,N)) of a dimension N, allocated to the weighting vectors, by whole numbers i that unambiguously identify each neuron i of the Kohonen network; b) forming a pattern vector v=(v(1), . . . , v(k), . . . , v(M)) from two-dimensional coordinates p(k), q(k) of feature points in the image, whereby v(k) describes a respective feature point in the image in a respective form v(k)=(p(k),q(k)); c) applying the pattern vector v=(v(1), . . . , v(k), . . . , v(M)) to each neuron i of the Kohonen network; d) calculating a respective output value a(i)=a(v,m(i)) for the applied pattern vector v=(v(1), . . . , v(k), . . . , v(M)) in each neuron i of the Kohonen network; and e) deriving sought positional parameters from respective positional parameters x(i)=(x(i,1), . . . , x(i,N)) allocated to a neuron i having an optimum output value a(v,m(i)).
 2. The method according to claim 1 for object detection in conjunction with simultaneous estimation of position parameters of an object in space from a two-dimensional image using a Kohonen network having neurons, wherein the method further comprises using, in addition to the positional parameters, object identities for recognition of objects in the step of adapting.
 3. The method as claimed in claim 1, wherein the output values a(i) are calculated using the relationship ##EQU17## from the pattern vector v and the weighting vectors m(i), and wherein minimum output values are optimal.
 4. The method as claimed in claim 1, wherein the output values a(i) are calculated using the relationship ##EQU18## from the pattern vector v and the weighting vectors m(i), weights g(v, k) being a measure of reliability of values of the components v(k) of the pattern vector v, and minimum output values being optimal.
 5. The method as claimed in claim 1, wherein the output values a(i) are calculated using the relationship ##EQU19## from the pattern vector v and the weighting vectors m(i), and wherein maximum output values are optimal.
 6. The method as claimed in claim 1, wherein said position parameters include rotational angles referenced to three axes of a spatial coordinate system.
 7. A method for estimating position parameters of an object in space from a two-dimensional image using a Kohonen network having neurons, comprising the steps of: a) prescribing indexable weighting vectors m(i)=(m(i,1), . . . , m(i,M)) of a dimension M and positional parameters x(i)=(x(i,1), . . . , x(i,N)) of a dimension N, allocated to the weighting vectors, by whole numbers i that unambiguously identify each neuron i of the Kohonen network; b) forming a pattern vector v=(v(1), . . . , v(k), . . . , v(M)) from two-dimensional coordinates p(k), q(k) of feature points in the image, whereby v(k) describes a respective feature point in the image in a respective form v(k)=(p(k),q(k)); c) applying the pattern vector v=(v(1), . . . , v(k), . . . , v(M)) to each neuron i of the Kohonen network; d) calculating a respective output value a(i)=a(v,m(i)) for the applied pattern vector v=(v(1), . . . , v(k), . . . , v(M)) in each neuron i of the Kohonen network; e) deriving sought positional parameters from respective positional parameters x(i)=(x(i,1), . . . , x(i,N)) allocated to a neuron i having an optimum output value a(v,m(i)); f) calculating an optimum of a function

    A(u,b)=a(v,b·m(w)+(1-b)·m(u))

of a real variable b with 0≦b≦1 for all grid points u in a predetermined environment U of a grid point w that is determined by the neuron i having the optimum output value a(v,m(i)); g) identifying a grid point, opt, for which A(opt,b(opt)) is optimum among all A(u,b(u)) for all grid points u of the environment of the grid point w, whereby b(u) identifies the position of the optimum of A(u,b) as a function of b in an interval 0≦b≦1; and h) deriving the sought positional parameters x(i) from the relationship

    x(i)=b(opt)·x(w)+(1-b(opt))·x(opt)

as a convex linear combination from positional parameters of two grid points.
 8. The method as claimed in claim 7, wherein the output values a(i) are calculated using the relationship ##EQU20## from the pattern vector v and the weighting vectors m(i), and wherein minimum output values are optimal.
 9. The method as claimed in claim 7, wherein the output values a(i) are calculated using the relationship ##EQU21## from the pattern vector v and the weighting vectors m(i), weights g(v, k) being a measure of reliability of values of the components v(k) of the pattern vector v, and minimum output values being optimal.
 10. The method as claimed in claim 7, wherein the output values a(i) are calculated using the relationship ##EQU22## from the pattern vector v and the weighting vectors m(i), and wherein maximum output values are optimal.
 11. The method as claimed in claim 7, wherein said position parameters include rotational angles referenced to three axes of a spatial coordinate system.
 12. The method as claimed in claim 7, wherein the method further comprises using, in addition to the positional parameters, object identities for recognition of objects in the step of adapting.
 13. A method for estimating position parameters of an object in space from a two-dimensional image using a Kohonen network having neurons, comprising the steps of: a) prescribing indexable weighting vectors m(i)=(m(i,1), . . . , m(i,M)) of a dimension M and positional parameters x(i)=(x(i,1), . . . , x(i,N)) of a dimension N, allocated to the weighting vectors, by whole numbers i that unambiguously identify each neuron i of the Kohonen network; b) forming a pattern vector v=(v(1), . . . , v(k), . . . , v(M)) from two-dimensional coordinates p(k), q(k) of feature points in the image, whereby v(k) describes a respective feature point in the image in a respective form v(k)=(p(k),q(k)); c) applying the pattern vector v=(v(1), . . . , v(k), . . . , v(M)) to each neuron i of the Kohonen network; d) calculating a respective output value a(i)=a(v,m(i)) for the applied pattern vector v=(v(1), . . . , v(k), . . . , v(M)) in each neuron i of the Kohonen network; e) deriving sought positional parameters from respective positional parameters x(i)=(x(i,1), . . . , x(i,N)) allocated to a neuron i having an optimum output value a(v,m(i)); f) calculating an optimum of a function ##EQU23## of a real variable b in an interval 0≦b≦1 for all grid points u in a predetermined environment U of a grid point w that is determined by the neuron i having the optimum output value a(v,m(i)); and g) deriving the sought positional parameters x(i) from the relationship ##EQU24## as a convex linear combination of positional parameters x(w) and x(u).
 14. The method as claimed in claim 13, wherein the output values a(i) are calculated using the relationship ##EQU25## from the pattern vector v and the weighting vectors m(i), and wherein minimum output values are optimal.
 15. The method as claimed in claim 13, wherein the output values a(i) are calculated using the relationship ##EQU26## from the pattern vector v and the weighting vectors m(i), weights g(v, k) being a measure of reliability of values of the components v(k) of the pattern vector v, and minimum output values being optimal.
 16. The method as claimed in claim 13, wherein the output values a(i) are calculated using the relationship ##EQU27## from the pattern vector v and the weighting vectors m(i), and wherein maximum output values are optimal.
 17. The method as claimed in claim 13, wherein said position parameters include rotational angles referenced to three axes of a spatial coordinate system.
 18. The method as claimed in claim 13, wherein the method further comprises using, in addition to the positional parameters, object identities for recognition of objects in the step of adapting.
 19. A method for adapting weighting vectors for estimating positional parameters of an object in space from a two-dimensional image using a Kohonen network having neurons, the neurons i forming a grid, comprising the steps of: a) prescribing indexable weighting vectors m(i)=(m(i,1), . . . , m(i,M)) of a dimension M and positional parameters x(i)=(x(i,1), . . . , x(i,N)) of a dimension N, allocated to the weighting vectors, by whole numbers i that unambiguously identify each neuron i of the Kohonen network; b) for every time t of the adaptation, b1) prescribing a learning rate L(t) that monotonically decreases with time; b2) prescribing a degree of coupling factor h(i,j,t) that allocates a degree of coupling that decreases monotonically with spacing of points of the grid and monotonically with time to two respective points (i(1),i(2),i(3)), (j(1),j(2),j(3)) on the grid that is formed by the neurons i of the Kohonen network; b3) prescribing a training pattern vector v(t)=(v(1,t), . . . , v(M,t)) of dimension M that is formed of two-dimensional coordinates of feature points in the image and positional parameters XT(t)=(XT(1,t), . . . , XT(N,t)) associated with said training pattern vector; b4) calculating momentary output values a(i,t)=a(v(t),m(i,t)) at time t for each neuron i from the prescribed training pattern vector v(t) and from momentary weighting vectors m(i,t) at time t; b5) calculating a neuron i with optimum output value a(v,m(i)), said neuron i having an associated grid point w; b6) adapting weighting vectors m(i,t) of every neuron i using the relationships

    m(i,t+1)=m(i,t)+L(t)·h(i,w,t)·(v(t)-m(i,t))

    x(i,t+1)=x(i,t)+L(t)·h(i,w,t)·(XT(t)-x(i,t)).


 20. The method according to claim 5, wherein object identities for recognition of objects are additionally taken into consideration in the adaptation.
 21. The method as claimed in claim 19, wherein the method further comprises using a learning rate L(t) of a form ##EQU28## with prescribed initial and final learning rates L(start) and L(end), respectively, up to a final instant t(end) of the adaptation.
 22. The method according to claim 19, wherein the method further comprises using a coupling strength function h(i, j, t) of a form ##EQU29## and with prescribed initial and final ranges s(start) and s(end), respectively, up to a final instant t(end) of the adaptation.
 23. The method as claimed in claim 19, wherein the pattern vectors v=(v(1), . . . , v(M)) in the form of v(k)=(p(k), q(k)) are formed from two-dimensional coordinates p(k), q(k) of feature points k=1, . . . , M in the image.
 24. The method as claimed in claim 19, wherein the output values a(i) are calculated using the relationship ##EQU30## from the pattern vector v and the weighting vectors m(i), and wherein minimum output values are optimal.
 25. The method as claimed in claim 19, wherein the output values a(i) are calculated using the relationship ##EQU31## from the pattern vector v and the weighting vectors m(i), weights g(v, k) being a measure of reliability of values of the components v(k) of the pattern vector v, and minimum output values being optimal.
 26. The method as claimed in claim 19, wherein the output values a(i) are calculated using the relationship ##EQU32## from the pattern vector v and the weighting vectors m(i), and wherein maximum output values are optimal.
 27. The method as claimed in claim 19, wherein said position parameters include rotational angles referenced to three axes of a spatial coordinate system.